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Abstract 

An open problem in numerical analysis is to explain why molecular 
dynamics works. The difficulty is that numerical trajectories are only 
accurate for very short times, whereas the simulations are performed 
over long time intervals. It is believed that statistical information from 
these simulations is accurate, but no one has offered a rigourous proof 
of this. In order to give mathematicians a clear goal in understanding 
this problem, we state a precise mathematical conjecture about molec- 
ular dynamics simulation of a particular system. We believe that if the 
conjecture is proved, we will then understand why molecular dynamics 
works. 



1 Introduction 

Molecular dynamics is the computer simulation of a material at the atomic 
level. In principle the only inputs to a simulation are the characteristics of 
a set of particles and a description of the forces between them. An initial 
condition is chosen and from these first principles the evolution of the system 
in time is simulated using Newton's laws and a simple numerical integrator 

mm- 

Molecular dynamics is a very prevalent computational practice, as a 
glance at an issue of the Journal of Chemical Physics will show. It does have 
its limitations: the motion of only a relatively small number of particles can 
be simulated over a short time interval. However, most of the mesoscopic 
models that have been suggested to overcome these difficulties still rely on 
molecular dynamics as a form of calibration. It is likely that molecular 
dynamics will continue to be important in the future. 

Given its scientific importance there is very little rigourous justification 
of molecular dynamics simulation. Prom the viewpoint of numerical analysis 



it is surprising that it works at all. The problem is that individual trajecto- 
ries computed by molecular dynamics simulations are accurate for only small 
time intervals. As we will see in Section [3j numerical trajectories diverge 
rapidly from true trajectories given the step-lengths used in practice. No 
one disputes this fact, and no one is particularly concerned with it either. 
The reason is that practitioners are never interested in particular trajec- 
tories to begin with. They are interested in ensembles of trajectories. As 
long as the numerical trajectories are representative of a particular ensemble 
of true trajectories, researchers are content. However, that this statistical 
information is computed accurately has yet to be rigourously demonstrated 
in representative cases. 

The goal of this article is to present a concise mathematical conjecture 
that encapsulates this fundamental difficulty. We present a model system 
that is representative of systems commonly simulated in molecular dynam- 
ics. We present the results of numerical simulations of this system using 
the Stormer-Verlet method, the work-horse of molecular dynamics. In each 
simulation a random initial condition is generated, an approximate trajec- 
tory for the system is computed and the net displacement of one particle 
over the duration of the simulation is recorded. We show that even for 
step-sizes that are far too large to accurately compute the position of the 
particle, the distribution of the particle's displacement over the many initial 
conditions appears to be accurate. From the numerical data we conjecture a 
rate of convergence for this particular statistical property. We believe that 
if this conjectured rate of convergence (or one like it) can be rigourously es- 
tablished, even for this single system, then we will understand significantly 
better why molecular dynamics works. 

The problem of explaining the accuracy of molecular dynamics simula- 
tion is well-known both in the physical sciences (for example [61 p. 81]) 
and in the mathematics community [12]. This latter reference is a survey 
of the relation between computation and statistics for initial value problems 
in general. There has been plenty of excellent mathematical work that has 
done much to explain various features of this type of simulation, but has 
not resolved the issue we consider here. See [HI [EH E] for surveys. 

One body of work that has addressed the statistical accuracy of under- 
resolved trajectories in a special case is by A. Stuart and co-workers. In 
[3j [T7] they have explored some linear test systems with provable statistical 
properties in the limit of large numbers of particles. They are able to show 
that if the systems are simulated with appropriate methods the statistical 
features of numerical trajectories are accurate in the same limit even when 
the step-lengths are too large to resolve trajectories. Though these results 
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are interesting since they are the only ones of their kind now known, for the 
highly nonlinear problems of practical molecular dynamics very different 
arguments will be required. 

One subproblem that has been attacked more successfully is that of the 
computation of ergodic averages. These are averages of functions along 
very long trajectories. All that numerical trajectories have to do to get 
these correct is sample the entire phase space evenly. This is a much weaker 
property than getting all statistical features correct. The most striking work 
on this question is by S. Reich [TT] which establishes rapid convergence of 
ergodic averages for Hamiltonian systems which are uniformly hyperbolic 
on sets of constant energy. Unfortunately, this property has never been 
established for realistic systems, and is unlikely to hold for them [9], [10] - The 
work }18j established similar results for systems with much weaker properties 
but requires radically small time steps for convergence to occur. 

The contribution of this work is to precisely specify a simple problem 
which encapsulates all the essential difficulties of the more general problem. 
In Section [2] we present the system we will study. Section [3] shows the 
results of some numerical experiments on this system. There we state our 
conjecture based on the results. In Section 0] we will discuss two possible 
approaches to proving the conjecture. Finally, in Section [5] we will discuss 
prospects for the eventual resolution of the conjecture. 



2 The System 

The system consists of n = 100 point particles interacting on an 11.5 by 11.5 
square periodic domain. We let q £ T 2n and p € M. 2n denote the positions 
and velocities of the particles, with q^ E T 2 ,pi € M 2 denoting the position 
and velocity of particle i. The motion of the system is described by a system 
of Hamiltonian differential equations: 

dq _ dH dp _ dH 
dt dp ' dt dq' 

with Hamiltonian 

H(q,p) = 1 -\\p\\ 2 2 + Y J VLj(h t ~ q'W). 

i<j 

Here Vlj denotes the famous Lennard-Jones potential. In our simulations 
we use a truncated version: 

Vr r(r) = / 4 (p* ~ pr) ! lf r - rcutoff ' 
Ljy ' \ 0, otherwise. 
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Figure 1: The positions of the particles for a representative state of the 
system. 

Figure Q] shows the positions of the particles on the periodic domain for one 
state of the system. Though the particles are only points, in the figure each 
is represented by a circle of radius 1/2. 

We take our initial conditions q°, p° to be randomly distributed according 
to the probability density function 

Z- 1 e - H to*V* T , (1) 

where Z is chosen so that the function integrates to one. This is known as 
the canonical distribution (or ensemble) for the system at temperature T. 
There is a simple physical interpretation of this distribution: if the system 
is weakly connected to another very large system at temperature T, this is 
the distribution we will find the original system in after a long period of 
time. In our units k = 1, and we choose T = 1. 

There are many ways of sampling from the canonical distribution at 
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a given temperature. For our experiments we generated initial conditions 
using Langevin dynamics. See [I] for an explanation of this technique and 
a comparison with other methods. If done correctly, the precise method of 
sampling from the canonical distribution will have no bearing on the results 
of the experiments we will present subsequently. 

The numerical method we use for integrating our system is the Stormer- 
Verlet scheme. Given an initial qo,po and a At > it generates a sequence 
of states q n ,p n ,n > such that (q n ,Pn) ~ (q(nAt),p(nAt)). The version of 
the algorithm we use is 

q n + p n At/2, 
p n - AtW(q n+1/2 ), 

'In 

This is a second-order explicit method. It is symplectic, and as a conse- 
quence conserves phase space volume [TJ. 

Finally we have to decide upon our step-length At. If At is too large 
the energy of the computed solution will increase rapidly and explode. In 
practice, it is observed that for small enough step lengths energy remains 
within a narrow band of the true energy for very long time intervals. (There 
is extensive theoretical justification for this phenomenon, see Section [4 . 1 [) . 
Practitioners tend to pick a At as large as possible while still maintaining 
this long-term stability on their time interval of interest. For the system and 
initial conditions we describe here At = 0.01 yields good approximate energy 
conservation on the time interval [0, 100]. For our numerical experiments we 
will let At take this value and smaller. (The recommended value in [6], a 
standard reference, for this type of system is At = 0.005.) 

3 The Problem 

We will first examine how well trajectories are computed with At = 0.01. 
Figure [2] shows the computed x-position of one particle versus time for the 
same initial conditions and for a range of step-lengths. If the trajectory 
computed by Stormer-Verlet is accurate over the time interval [0,5], we 
expect that reducing the time step by a factor of a thousand would not 
yield a significantly different curve. However, we see that the two curves for 
At = 0.01 and At = 0.00001 very quickly diverge. They are distinguishable 
to the eye almost immediately and completely diverge around 1.2 time units. 

Reducing the step length to At = 0.001 gives a curve that agrees with the 
At = 0.00001 line longer, but still diverges around 2.5 time units. Similarly, 



Qn+l/2 = 
Pn+1 = 
Qn+1 = 
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Figure 2: Computed x-position of one particle versus time for fixed initial 
conditions for a range of At. 

even with At = 0.0001 trajectory is not accurate over the whole interval 
depicted. 

From these numerical results, we might conjecture that reducing the 
step-length by a constant factor only extends the duration for which the 
simulation is accurate by a constant amount of time. This is consistent with 
theoretical results about the convergence of numerical methods for ordinary 
differential equations. What is surprising in this case is that the time-scale 
on which the trajectories are valid appears to be miniscule compared to the 
time-scale on which computation are actually performed. It seems that the 
trajectories we compute here with stepsize even as small as At = 0.00001 
are not accuarate over the whole interval [0,5] let alone over considerably 
longer intervals. 

Fortunately we almost never care about what one particular trajectory 
is doing in molecular dynamics. We only care about statistical features of 
the trajectories when initial conditions are selected according to some prob- 
ability distribution. Here we will consider the example of self-diffusion. Self- 
diffusion is the diffusion of one particular particle through a bath of identical 
particles. We can imagine somehow marking one particle at time zero and 
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Figure 3: Displacement in x direction of 1 particle at T = 10 for three 
different step-lengths. 

watching its motion through the system. This single-particle trajectory will 
depend on the positions and velocities of all the particles (including itself) 
at time zero. Since these are random, the trajectory of the single particle is 
random. 

One way to measure self-diffusion is to look at the distribution of the in- 
coordinate of the tracer particle relative to its initial condition. To estimate 
this, we generate many random initial conditions, perform the simulation 
using the Stormer-Verlet method, and record the net displacement of the 
particle in the given direction. Figures [3] show the histograms of these dis- 
placements at time T = 10 for three different step-lengths. 

In contrast to the case where we examined single trajectories, here the 
histograms are virtually identical for the different step- lengths. This sug- 
gests that any information we glean from the first histogram will be accurate. 

To check this more carefully, we compute the variance of the total 
displacement at various times T for varying step- lengths. Let R{T) = 
||<7i(T) — <7i(0)|| denote the total displacement of the particle after time 
T. This is a random quantity through its dependence on the state of the 
system at t = 0. Let B.At(T) denote this same displacement as simulated 
with the Stormer-Verlet method. This also is a random quantity. Now de- 
fine (R\ t (T)) to be the expected value of R\JT) when the initial conditions 
are chosen according to the canonical distribution. Let us see how this last 
quantity depends on At. We do this by generating many initial conditions 
from the canonical ensemble and then simulating the system for 100 time 
units, keeping track of the total displacement of the tracer particle. 

Figure S] shows (R\JT)) versus T for three choices of step length. The 
inset shows a subset of the data with error bars. Up to the sampling error 
there is no difference between the curves. As far as we can tell from this 
plot, the answers for At = 0.01 are accurate. The time-scale is much larger 
than the short interval we found the trajectory to be accurate over. Lest 
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we give the impression that {R\ t (T)) depends linearly on T, Figure [5] shows 
the same results for a smaller time interval. 

We conjecture that the reason (R 2 At (T)) does not appear to depend on 
At is that even for these large values of At it closely matches (R 2 (T)). It is 
not clear at all what the rate of convergence of R&t(T) to R(T) is and how 
it depends on T. However we make the following conjecture: 

Conjecture 1 For the system described in Section^ with the initial distri- 
bution given by |7]j and the Stormer-Verlet integrator with time step At 

\(R 2 At (T))-(R 2 (T))\<CAt 2 , 

for allT E [0,Ae B / At ], for some constants A, B, C. 

We will explain the reasons for hypothesizing this particular dependence in 
the next section. Here we will briefly note what dependence the classical 
theory of convergence for numerical ODEs gives: 

\{R 2 At(. T )) ~ (R 2 (T))\ < Ce LT At 2 

for T G [0,Elog(F/At)] for sufficiently small At for some C,L,E,F > 0. 
(See |16l p. 239], for example.) So we need to explain why the error remains 
so small even for long simulations. 
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Figure 5: Same as Figure [4] but on a smaller time interval. 



4 Two Approaches 

We will discuss two possible approaches to proving Conjectured! backward 
error analysis and shadowing. 



4.1 Backward Error Analysis 

Typically a pth order numerical method applied to a system of ODEs com- 
putes a trajectory that is 0(At p ) close to the exact trajectory on a finite 
interval. Backward error analysis is a way of showing that the numerical 
trajectory is an C(exp(— 1/At)) approximation to the exact trajectory of a 
perturbed system. This result can be used in turn to prove results about 
the stability of the numerical trajectory. See [2] for an early reference and 
[7J Ch. IX.] for a recent comprehensive treatment of the subject. 

If we apply a symplectic integrator to a Hamiltonian system it turns out 
that the modified system is also Hamiltonian. The Hamiltonian function 
H for the new system can be written as H = H + 0(At 2 ). There are two 
consequences for us. Firstly, the numerical method agrees very closely with 
the exact solutions of the modified Hamiltonian on short time intervals. If we 
denote the solution to the modified system with the same initial conditions 
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by (q,p) then 

\q(nAt) -q n \ < Ce~ D l At (2) 

for T € [0,-B/Ai], for some appropriate constants [5]. (This alone is not 
useful for analysing molecular dynamics since T and At are both large.) 
Secondly, the modified Hamiltonian H is conserved extremely well by the 
numerical method for long time intervals: 

H(q ,p°)-H(q n ,p n )\ <Ce' D / At , 

for nAt £ [0, Ae B / At }. Putting this together with H = H + 0(At 2 ) gives 
\H(q°,p°) -H{q n ,p n )\ < EAt 2 , 

for nAt G [0, Ae B l At \. We chose the bound in Conjecture 1 in analogy with 
this last result. 

Suppose we wanted to bound the error between (R 2 At (t)) and (R 2 (t)) 
using these estimates. The fact that the initial conditions are random adds 
an extra level of complication to the problem. We have been using (•) 
to denote the average with respect to the canonical distribution for the 
Hamiltonian H. The perturbed Hamiltonian H has a different canonical 
distribution. We denote averages with respect to it by (•)'. We let R denote 
the net displacement of the tracer particle under the new flow given by H. 

We might try bounding the error in the following way: 

\(R 2 At (T))-(R 2 (T))\ < \(Rl t (T)) - (R 2 (T))\ 

+ \(R 2 (T))-(R 2 (T))'\ 
+ \(R 2 (T)Y - (R 2 (T))\ 

We discuss each of the three terms in turn. 

The first term is due to the numerical trajectory not agreeing with the 
exact trajectory of the modified system with Hamiltonian H. According 
to ([2]) we can bound this term by Cexp(— D / At) for a duration of B/At. 
The studies in [5] suggest that this is a tight estimate for typical molecular 
dynamics simulations. 

The second term is the difference in the expectation of R 2 (t) due to 
a perturbation in the measure. Since the two measures are proportional 
to exp(—H/kT) and exp(—H/kT) respectively, and H — H = 0(At 2 ), we 
expect this term to be on the order of 0(At 2 ) for all T. This probably can 
be rigourously controlled without much difficulty. 
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The third term is just the difference in (R 2 (t)} between the original sys- 
tem and the perturbed system. This is likely to be extremely difficult to 
bound. However, showing that it is small is not a question about computa- 
tion but about statistical physics. For now let us assume that it is 0(At 2 ) 
for all T for now. 

Already we can see that this approach will not get us the result that 
we want, even assuming we can bound the third term. The best estimate 
we have so far is that the error is bounded by 0(At 2 ) for T £ [0, B/At]. 
The bound would hold on an interval much shorter than what is needed. 
It appears that backward error analysis alone cannot explain the observed 
convergence. 

4.2 Shadowing 

The idea of shadowing is complementary to that of backward error analy- 
sis. Whereas backward error analysis shows that the numerical trajectory is 
close to the exact trajectory of a different Hamiltonian system with the same 
initial condition, shadowing attempts to show that the numerical trajectory 
is close to an exact trajectory of the same Hamiltonian system with a differ- 
ent initial condition. See [8j for a nice review of shadowing for Hamiltonian 
systems. 

In our situation, if shadowing were possible, something like the follow- 
ing would hold. Suppose we compute a numerical trajectory starting from 
(q°,p°) with time step At, which we denote by (q n ,p n ),n > 0. If shad- 
owing is possible then there is an exact trajectory (q(t),p(t)) of the same 
Hamiltonian system starting at some other initial condition (q(0),p(0)) such 
that 

(q n ,p n )^(q(nAt),p(nAt)) 

for nAt in some large range of times. Assuming that it is possible to shadow 
every numerical trajectory in this way, let us denote the map on the phase 
space that takes the numerical initial condition to the initial condition of 
the shadow trajectory by 

SAt(q°,P°) = (q(0),P(0)). 

The idea of shadowing is used very effectively by S. Reich in [11]. For a 
Hamiltonian system for which shadowing holds he demonstrates that long- 
time averages will be computed accurately by almost all numerical trajec- 
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tories. That is, 



T 1 N 

g(q(t),p(t))dt « lim ^ 5>(g n ,p n ), (3) 

n=0 

for almost all initial conditions (g°,p ) = (g(0),p(0)), for reasonable func- 
tions g. Since the quantity on the left does not depend on (g(0),p(0)) in the 
systems considered in [11] (except for sets of measure zero), it is sufficient 
that such a map Sa* exists to get the result. 

In our case we are interested in more general statistical features of tra- 
jectories than long-time averages. For example, the variance of the displace- 
ment of a single particle in a finite time interval cannot be put into the form 
of a long-time average such as in [3l This puts more stringent requirements 
on S/s.t- To show that statistics are captured correctly we cannot consider 
just single trajectories; we have make sure the entire ensemble's statistics 
are reproduced correctly. If the shadowing map S^t systematically picked 
initial conditions for which the tracer particle tended to move to the left, 
for example, then the computed statistics could be quite inaccurate. See [8] 
for a discussion of this issue in the context of astrophysics. What is neces- 
sary for this shadowing to work is for S^t to leave the canonical ensemble 
invariant: 

(G(q,p)} = (G(S At (q,p))) (4) 

for some suitably broad class of functions G on phase space. This is an even 
more stringent requirement than just that shadowing is possible at all, and 
it may be quite unlikely to hold for our system. 

Fortunately we can weaken some other requirements demanded of shad- 
owing considerably for our problem. We do not need the trajectory of the 
whole system to be close; we only need the trajectory of a single particle to 
be close. Suppose that our tracer particle's numerical trajectory is denoted 
by (qi,Pi) for n > 0. We say that weak shadowing holds if we can select 
q(0), p(0) such that 

(q^,p r l)^(qi(nAt)),p 1 (nAt))) 

for nAt in some long range of times. 

To see how this fits in with the conjecture suppose that we have both 
(|4|) and 

\\(q^p n 1 )-(q 1 (T)),p 1 (T))\\<CAt 2 . (5) 



lim — 
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for T = nAt € [0, Ae B '^\. This means that (assuming we can obtain 
reasonable bounds on R\ t {T) and R 2 (T)) that 

\(Rl t (T))-(R 2 (T))\ < K\{\\fi\\) - (\\<n(t)\\)\ 

< K\(\\Qi\\) ~ (\MT)\\)\ + K\(\MT)\\) - (\\qi(T)\\)\ 

< K(\\(q n 1 ,p n 1 )-(q 1 (T)),p 1 (T))\\) 
+K\(G(S At (q°,p ))) - {G(q ,p ))\, 

for T £ [0, Ae B l^ l \. Here we have let G be the composition of the time T 
flow map of the Hamiltonian system with the 2-norm. Now the first term 
above is bounded by CTe~ D / A * by ([5]) and the second term is by (HJ), 
thus establishing the conjecture. Simultaneously proving @ and (0) for 
some shadowing map S^t may not be easy, but it may be much easier than 
proving the usual stronger shadowing result. 

5 Discussion 

Despite the ideas presented in the previous section, the conjecture we have 
presented is probably not open to attack by existing techniques. The prob- 
lem is that there is no rigourous mathematical theory of how statistical 
regularities emerge from the dynamics of generic high-dimensional Hamilto- 
nian systems. Consequently, there is no theory of how perturbations in the 
Hamiltonian dynamics leads to perturbation in the statistics. A numerical 
analyst has three choices when faced with this situation: 

1. Take Up Mathematical Physics. If we are to make progress on the 
conjecture these entirely non- numerical problems need to be tackled 
first. Mathematical physicists are interested in proving things like 
ergodicity and decay of correlations for Hamiltonian systems such as 
presented here, and it is conceivable that eventually there will a robust 
body of theory that we can apply to our problem. So one possibility 
is to work on developing such a theory. This likely will not have much 
to do with computation. 

2. Relax Standards of Rigour. Theoretical physicists, as opposed to 
mathematical physicists, have accepted that much reliable informa- 
tion can be obtained through calculations that cannot be rigourously 
justified. Typically theoretical physicists study systems about which 
nothing interesting can be proved; to do otherwise would be far too 
restrictive. There is no reason why this informal yet highly fruitful 



13 



style of reasoning should be restricted to systems themselves and not 
numerical discretizations of systems. A combination of non-rigourous 
arguments and careful numerical experiments could do a lot to clar- 
ify how the Stormer-Verlet method is able to compute statistics so 
accurately for our system. 

3. Abandon the Whole Pursuit. For many, the purpose of numerical 
analysis is to provide reliable, efficient algorithms. If one is pursuing a 
theoretical question, it is hoped that it will lead to better algorithms 
eventually. Sadly, even a complete resolution of the conjecture we have 
presented in unlikely to have much effect on computational practice. 
Many people have tried for years to devise an integrator that is more 
efficient than the Stormer-Verlet method for computing statistically 
accurate trajectories in molecular dynamics. They have only been 
successful for Hamiltonian systems with special structure. (The prime 
example of this is the multiple time stepping methods, see [7, Ch. 
VIII. 4].) In fact, we state another conjecture which is not formulated 
rigourously. 

Conjecture 2 No integration scheme can improve the efficiency by 
more than a factor of two with which Stormer- Verlet computes statis- 
tically accurate trajectories for systems like that in Section \M 

Here even a clear mathematical formulation would be a challenge. 
Obviously if we already know a lot about a system we can contrive an 
algorithm which will give correct statistics for a tracer particle, but 
this does not count. The conjecture is intended to capture the idea 
that Stormer-Verlet is a very general purpose method; we do not need 
to know anything about a system to apply it. 

At the Abel Symposium participants seemed to prefer the first of the 
three options: try to prove what one can about the system and its dis- 
cretization. 
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